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Abstract 

Background: Severe refractory asthma is a heterogeneous disease. We sought to determine statistical clusters from the 
British Thoracic Society Severe refractory Asthma Registry and to examine cluster-specific outcomes and stability. 

Methods: Factor analysis and statistical cluster modelling was undertaken to determine the number of clusters and their 
membership (N = 349). Cluster-specific outcomes were assessed after a median follow-up of 3 years. A classifier was 
programmed to determine cluster stability and was validated in an independent cohort of new patients recruited to the 
registry (n = 245). 

Findings: Five clusters were identified. Cluster 1 (34%) were atopic with early onset disease, cluster 2 (21%) were obese with 
late onset disease, cluster 3 (15%) had the least severe disease, cluster 4 (15%) were the eosinophilic with late onset disease 
and cluster 5 (15%) had significant fixed airflow obstruction. At follow-up, the proportion of subjects treated with oral 
corticosteroids increased in all groups with an increase in body mass index. Exacerbation frequency decreased significantly 
in clusters 1, 2 and 4 and was associated with a significant fall in the peripheral blood eosinophil count in clusters 2 and 4. 
Stability of cluster membership at follow-up was 52% for the whole group with stability being best in cluster 2 (71%) and 
worst in cluster 4 (25%). In an independent validation cohort, the classifier identified the same 5 clusters with similar patient 
distribution and characteristics. 

Interpretation: Stat\sX\ca\ cluster analysis can identify distinct phenotypes with specific outcomes. Cluster membership can 
be determined using a classifier, but when treatment is optimised, cluster stability is poor. 
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Introduction 

Severe refractory asthma is a heterogeneous multi-dimensional 
disease and is a consequence of a variety of pathophysiological 
mechanisms driven by complex interactions between the host and 
environment [1-4]. Algorithmic cluster analysis, such as hierar- 
chical and k-means clustering, has been used to determine severe 
or severe refractory asthma phenotypes in adults [5-1 1] and 
children [12] and these clusters have demonstrated clinical utility 
with cluster-specific response to therapy [1]. However, these 
analyses have several limitations including difficulties in determin- 



ing the best fitting number of clusters, as they are algorithmic by 
nature and do not rely on statistical inference to determine this 
number [13]. More importantly, the longitudinal phenotypic 
stabihty of described clusters in a clinical setting has not been 
previously examined. One challenge is that most approaches to 
clustering can only be apphed to the data at a single time point and 
thus may derive new and additional clusters at another time point 
in the same population making the assessment of the repeatability 
of cluster membership and phenotype stability impossible. 

An alternative approach to determine partitioning of multivar- 
iate data is to use statistical mixture models [14] and these have 
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been applied in several disciplines such as gene-expression [15], 
neurophysiology [16], and magnetic resonance imaging [17]. 
Mixture models are a family of statistical models that determine 
the best fitting number of clusters or mixtures by comparing 
models with dilfering number of mixtures or clusters using model 
selection criteria to choose the best-fitting model such as the 
Akaike information criterion [18] and the Bayesian information 
criterion [19]. These models are less sensitive to oudiers and less 
sensitive to over fitting the data values than k-means and 
hierarchical clustering [20]. In addition, classifiers that determine 
cluster membership can be applied to the same population 
repeatedly and across cohorts to define phenotypic stability. 

Here we report in a multi-centre longitudinal observational 
study of severe refractor)- asthma the apphcation of statistical 
clustering, determine phenotypic-specific changes over time and 
define the consequent affects upon phenotype stability. 

Materials and Methods 

Ethics statement 

Written informed consent for data entry and analysis from the 
multi centres was obtained from all subjects and the study was 
approved by each centre's local ethics committee. Ethical approval 
for the British Thoracic Society Difficult Asthma Registry across 
sites was obtained from the Office for Research Ethics Committees 
Northern Ireland (number 10/NIR02/37). In Leicester additional 
ethics was also obtained from the Leicestershire, Northampton- 
shire and Rudand ethics committee (REC 6307). 

Subjects and protocol 

Subjects were recruited from centres contributing data to the 
British Thoracic Society Severe refractory Asthma Registry [21- 
25]. Subjects were assessed using a standardised protocol that 
included the recording of demographics, pulmonary function tests, 
allergy assessment, and standardised blood panel, including 
peripheral blood eosinophil count and total serum immunoglobuin 
(IgE). Data was entered into an anonymised, secure, web-based 
registry hosted by Dendrite Clinical Systems Ltd. Subjects in the 
first dataset were from 4 centres the Regional Severe refractory 
Asthma Service, Belfast City Hospital, Institute for Lung Health, 
Glenfield Hospital, Leicester, Royal Brompton Hospital, London 
and North West Lung Centre, Manchester. These subjects were 
assessed at baseline and after a minimum of 1 year follow-up. 
Subjects were clinically managed in accordance with local and 
national guidelines to optimise asthma control, and minimize 
future risk of exacerbations, lung funclion impairment and side- 
efFfX'ts to therapy. The characteristics of these subjects have been 
described previously [21-25]. The severe refractory patients were 
defined by being on oral steroids 50% of the year or on high dose 
inhaled steroids plus add on medications either at baseline or 
follow-up. 

The second cross-sectional validation cohort was recruited from 
7 centres including the original 4 centres as well as Gartnavel 
Hospital and Stobhill Hospital in Glasgow, and the Heart of 
England Hospital, Birmingham. Written informed consent for 
data entry and analysis was obtained from all subjects and the 
study was approved by each centre's local ethics committee. 

Statistical analysis 

Statistical analysis was carried out in R programming language 
and SPSS 10.0 for Windows. Multiple imputation algorithms were 
undertaken to allow for missing values. Variables were selected for 
use in the multiple imputation analyses that satisfied several 
conditions; they had less than 30% missing data and the missing 



data was either classified as missing completely at random or 
missing at random [26]. In order to determine which variables 
were deemed missing at random and not missing at random we 
used prior knowledge from multiple clinicians to determine if data 
missing was due to patient severity or due to random chance of 
missing information in medical records. Severity is the key 
mechanism for informative missing-ness for severe asthma as if 
patients are too severe some of the measurements are contrain- 
dicated. The cases in variables that were not missing due to 
severity of asthma were included in the multiple imputation 
algorithm. 

This reduced the total number of variables to 23 variables on 
which multiple imputation was carried out, see supplement for a 
list of variables used and for more details on multiple imputation. 
Multiple imputation can be carried out for up to 70% missing data 
in some cases if the variables are missing completely at random, 
missing at random or if there is no bias from missing samples [26]. 
The missing variables are predicted using regression equations and 
5 missing values are added for each variable. This methodology 
obtains less biased parameter estimates than removing the patients 
with missing values or imputing with the mean [26]. The mi 
package in R was used for the implementation of the missing data 
prediction as it allowed for convergence of the model to be 
checked. Convergence is checked for both model parameters and 
data before deriving the missing values to obtain good practices 
and good predictions for missing values see online supplement. 

Factor analysis was carried out in SPSS using principal 
component methodology with varimax rotation to determine 
factor scores and eigen values for each factor. We use factor 
analysis to determine underlying clinical processes. The underlying 
processes can be thought of as the independent structure of the 
data. The factor analysis was carried out using the available non- 
categorical variables. The treatment variables oral steroid dose 
and Beclomethasone Dipropionate (BDP) equivalent dose inhaled 
corticosteroids w(-re transformed into a ranked treatment variable 
before factor analysis, with patients with the highest inhaled 
steroid treatment and highest oral steroid dose having the highest 
rank. The underlying factors from the factor analysis were saved as 
variables describing independent asthma pathology using the 
derived factor scores. The factor scores were then used as input 
variables for the cluster analysis to determine sub-groups within 
the differing asthma pathology. 

The clustering was carried out using a two way cluster/mixture 
analysis [27,28] model that compares models for 1 to 15 clusters 
using the Bayesian information criterion. The input variables for 
the cluster analysis were the output latent factors from the previous 
factor analysis. A cluster analysis was carried out for each of the 5 
imputed datasets with 5 cluster membership as outcomes. The 5 
cluster membership were combined using the same two-way 
cluster algorithm but for categorical data to obtain a global 
average cluster membership, for the many steps of the multiple 
imputation cluster analysis please sc'e Figure El in File SI. 
Comparisons between clusters were made using the most 
appropriate test: one-way analysis of variance (ANOVA) for 
normally distributed data, -test for proportional data and 
Kruskal-Wallis test for non-normally distributed variables. The 
non-missing data for each variable was used when testing. The 
follow up data was analysed by applying paired t-tests or WUcoxon 
matched pair signed rank tests as appropriate to determine 
differences over time for each cluster, general estimating equations 
(GEE) models were also used but not presented here. 

After cluster membership is determined a classifier can be 
created. A classifier uses cluster specific statistics such as the mean 
and standard deviation of key variables to determine cluster 
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Table 1. Factor loading for continuous variables from first imputed data set, other factor analysis on the other 4 imputed datasets 
showed similar structure and factor loading. 







Continuous Factors 












1 


2 


3 


4 


5 




Airflow Obstruction 


Exacerbation 


IgE/Obesity 


Treatment scaling 


Blood Eosinophilia 


Cumulative Percentage of 
variance explained by factors 


21.3% 


37.5% 


51.4% 


62.6% 


72.1% 


BMI 


.050 


.205 


.651 


.346 


.005 


Pre-bronchodilator FEVi % 
predicted 


.990 


-.004 


— .012 


-.070 


-.031 


Pre-bronchodilator FVC % 
□redicted 


.784 


-.036 


— .338 


.123 


.139 


Pre-bronchodilator FEVi/FVC 


.724 


.109 


.296 


-.195 


-.196 


Blood eosinophils x10^/L 


-.067 


.166 


-.155 


-.050 


.819 


Total IgEkU/L 


.070 


.068 


-.621 


.058 


.071 


Exacerbation frequency 
(number of rescue oral 
steroid courses in last year) 


.052 


.796 


-.063 


.247 


.092 


^lumber of primary care visits 
in last year 


.013 


.759 


.069 


.350 


.010 


Treatment 


-.090 


-.029 


.067 


.847 


-.077 


Age at onset of symptoms 


.043 


-.225 


.519 


-.071 


.572 
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membership for new patients, it is a stand-alone algorithm that 
unlilie the cluster analysis does not require the rest of the data to 
predict cluster membership. The clinical variables used as inputs 
depend on their predictive accuracy of cluster membership with 
the best predictors being kept into the classifier. A function in R 
was programmed to classify the patients into one of the five 
clusters, based on the specific cluster multivariate parameters that 
were found in the cluster analysis. Every cluster has a unique 
multivariate distribution based on a subset of input parameters. A 
patient is classified as belonging to the cluster with the most likely 
cluster multivariate characteristics for that patient, without the 
need of the rest of the data to determine cluster membership. 
Once classified the patients (supervised) cluster classification was 
compared with their original (unsupervised) cluster classification to 
determine percentage accuracy at baseline and stability at follow- 
up. The classifier was apphed to a second dataset to determine if 
the clusters were in similar proportions and had similar 
characteristics to the original dataset. A p value of <0.05 was 
taken as the threshold of statistical significance. For further details 
of the classifier see online supplement. 

Results 

Three-hundred and forty-nine subjects from 4 centres were 
included in the baseline analysis. The criteria for selection of the 
variables used in the factor analysis were thus: Variables had to 
have less than 30% missing data in the original cohort and were 
also non-categorical. Five factors were found for the continuous 
variables using the Kaiser criteria for selection of the number of 
factors. Five factors were found for the continuous variables, these 
factors or aspects can be described as representing airflow 
obstruction, exacerbation frequency, IgE/body mass index 
(BMI), treatment scaling and peripheral blood eosinophilia 
(Table 1). These five derived factor variables were used as input 
variables in to the cluster analysis. The optimal number of clusters 



that were identified for each multiple imputed dataset is shown in 
Table El in File SI with the BIC for each cluster model and 
imputed dataset in Figure E2 in File S 1 . The cluster memberships 
returned were all very similar with very similar number of clusters, 
although as previously discussed removal of missing data would 
give a more concrete cluster membership and number of clusters. 
Only using patients that had no missing data would create a 
selection bias. 

Membership was taken from each imputed dataset and used as 
a variable to carry out latent class analysis to determine mean 
cluster membership. A five cluster model best described the 
dataset. Their characteristics are as shown Table 2. 

Cluster 1 comprised 34% of the cohort; had early onset, atopic 
asthma with the highest intensive care (ITU) admissions, high 
exacerbation frequency, half the subjects receiving regular 
systemic corticosteroid therapy, and poor lung function but with 
the greatest bronchodUator reversibility. Cluster 2 comprised 2 1 % 
of the cohort; was an obese, late onset group with frequent 
exacerbations, over half the subjects receiving regular systemic 
corticosteroid therapy, near normal lung function and the highest 
depression score. The remaining three clusters comprised 15% 
each of the overall population. Cluster 3 subjects were mainly non- 
atopic with normal lung function, the fewest number of subjects 
receiving systemic corticosteroid therapy and infrequent exacer- 
bations. Cluster 4 was a late onset gxoup, with a markedly elevated 
peripheral blood eosinophilia and frequent exacerbations. Cluster 
5 had the worst lung function and the highest proportion of 
subjects receiving systemic corticosteroid therapy, but had the least 
frequent exacerbations. The proportions of subjects in each cluster 
varied between centres Table E2 in File S 1 . 

Subjects were followed up for a median (interquartile range) 3.1 
(1.9 to 5.5) years. The cluster-specific changes in clinical outcomes 
at follow-up are as shown (Table 3 and Figure 1), GEE models 
were also carried out and showed similar effects for the variables 
but are not presented here. At foUow-up the proportion of subjects 
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Cluster 1 Cluster 2 Cluster 3 Cluster 4 Cluster 5 

100 -| 

90 - 

Pre Bronchodilator 80 ~ 



FEVl % predicted 



Blood eosinophil 
count lOVL 




0.0 





Figure 1. Differences in clinical outcomes between baseline and follow-up for each cluster. Pre Bronchodilator FEV1 % predicted and 
Blood eosinophil count statistics are Mean (+/-SD). Exacerbation frequency outcomes are median (IQR) and percentage on oral steroids are 
percentages. 
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treated with oral corticosteroids increased in all groups with 
statistically significant increases for clusters 3 and 4. This was 
coupled with a concomitant increase in BMI, which was 
statistically significant in all but cluster 2. Clusters 1, 4 and 5 
had statistical improvement in forced expiratory volume in 1 
second (FEVi) Mean (SD) 0.28 (0.64), 0.28 (0.68), 0.20 (0.54) 
respectively. There was a statistically significant decrease in 
exacerbation frequency in clusters 1, 2 and 4 median (IQR) —2 
(—6,-0), —2 (—4, 1), —2 (—5, 0) respectively with an associated 
improvement in the Hospital Anxiety and Depression (HAD) score 
in clusters 1 for anxiety mean (SD) — 1.2 (3.1) and 2 for depression 
mean (SD) — 2.1 (3.5). The peripheral blood eosinophil counts in 
clusters 2 and 4 mean (SD), -0.08 (0.27), -0.53 (0.88) 
respectively. 

The classifier produced 79% correct cluster membership when 
compared with original cluster membership at baseline (Table 4) 
with Cluster 2 being the best predicted and Clusters 1 and 4 being 
predicted well. The classifier was used on data from the follow-up 
time point to determine if clusters remained consistent over time. 
The classification at the next time point compared to the baseline 
clustering was 52% (Table 5). Patients that were originally in 
cluster 2, the obese asthmatic group, were the most consistent over 



time. The largest changes were that a substantial proportion of the 
patients in clusters 1 , 4, and 5 moved into clusters 2 and 3 at follow 
up reflecting the changes in lung function, peripheral blood 
eosinophilia and exacerbation frequency. 

The classifier \vas also used to classify data on the new dataset of 
245 severe refractory asthma patients presenting to seven specialist 
centres including the 4 original centres. These new patients 
received a cluster annotation. The clusters were in similar 
proportions in the new dataset and shared similar properties to 
the original cohort (Table 6).The proportions of subjects in each 
cluster were different across the centres (Table E2 in File SI). The 
cluster specific variables were tested between each dataset to 
determine cluster similarities across the two datasets (Table E3 in 
FUe SI). 

Discussion/Conclusions 

We describe here for the first time the application of statistical 
clustering in a multi-centre- longitudinal observational study of the 
British Thoracic Society Severe refractory Asthma Registry. To 
date this is the largest group of severe refractory asthma patients to 
be included in a cluster analysis. Five clusters were identified and 
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Table 4. Cluster and Classification membership compared at baseline. 



Predicted cluster classification for baseline data 



Cluster 3 'normal 

Baseline Cluster 1 'Early Cluster 2 'Obese, lung function least Cluster 4 'late Cluster 5 'Airflow 

cluster onset, atopic' late onset' severe asthma' onset, eosinophilic' obstruction' 



1 


80.0 


7.1 


5.9 


3.5 


3.5 


2 


4.4 


88.9 


2.2 


4.4 


0.0 


3 


2.6 


10.3 


74.4 


2.6 


10.3 


4 


5.3 


2.6 


2.6 


86.8 


2.6 


5 


16.7 


10.0 


13.3 


0.0 


60.0 
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using a classifier we have validated these clusters in an 
independent cohort of patients submitted to the Registry. We 
have determined phenotypic-specific changes over time and 
applied the classifier to report for the first time cluster membership 
stability over time, when treatment is optimised. Stability of cluster 
membership at foUow-up was 52% for the whole group ranging 
from 25-7 1 % within clusters. Therefore, statistical cluster analysis 
can identify distinct phenotypes with specific outcomes, but the 
stability of cluster membership is poor. 

The clusters were obtained by carrying out statistical mixture 
models, where a different number of clusters are tried iteratively 
and compared using the model fitting criteria Akaike information 
criterion and the Bayesian information criterion to obtain the 
statistically best fitting number of clusters. This clustering method 
allows the number of clusters to be obtained objectively rather 
than relying on graphs or expert opinion. The five clusters 
identified were Cluster 1 (34%) the most atopic with early onset 
disease, cluster 2 (21%) obese with late onset disease, cluster 3 
(15%) least severe disease, cluster 4 (15%) the most eosinophilic 
with late onset disease and cluster 5 (15%) severe airflow 
obstruction. To further validate these clusters a classifier was 
created to predict cluster membership for the original dataset and 
was applied to a new dataset of independent patients recruited to 
the British Thoracic Society Severe refractory Asthma Registry. 
This classifier was able to assign cluster membership with 79% 
accuracy. The 5 clusters in the validation group were of similar 
proportions and had very similar clinical characteristics. 

It is worth noting that all conclusions from the data are found 
using the cluster and classifier methodology. We present an un- 
biased statistical cluster analysis technique which selects the 
number of clusters based on the fit if the data. Bias could come 



from the choice of variables to include in the cluster analysis we 
chose asthma related-variables that had minimal missing values. 
The classifier plays a crucial role in monitoring asthma stability, 
we used the classifier that best derived the clusters from the data to 
obtain the best results for all clusters but the conclusions are 
limited to the accuracy of the classifier which was good in this case. 

Although cluster instability could be due to many factor such as 
cluster methodology, classifier accuracy, underlying disease 
variability or response to treatment. The cluster instability we 
found was associated with the clinical characteristics in patients in 
clusters significantly changing over foUow up. This significant 
change caused them to look like other cluster characteristics found 
at baseline, suggesting that the cause of cluster instability is due to 
either the disease variability or response to treatment and not due 
to statistical methodology. 

Our confidence that these clusters are robust is also derived 
from comparison with other clusters described in the literature. 
The first multi-centre cluster analysis of asthma undertaken in 
North America the Severe Asthma Research Network (SARP) [6] 
also found 5 clusters, but only 3 of these represented patients with 
severe disease. There were similarities with our clusters with the 
identification of an obese female predominant cluster, a cluster 
with moderate airflow obstruction and another with severe airflow 
obstruction. In the first cluster analysis of severe asthma 
undertaken in a single-centre study in Leicester, UK Haldar et 
al [5] also described 5 clusters annotated as 'early-onset atopic', 
'obese female', 'benign asthma', 'inflammation predominant' and 
'early symptom predominant', which was replicated in an 
independent group of patients from the same centre [7] . The first 
4 clusters match extremely well with the clusters described here in 
this multi-centre study. Spirometry was not included in the cluster 



Table 5. Cluster memberships at baseline compared to classification membership at follow up. 



Predicted clusters from year follow up data 



Baseline 
cluster 


Cluster 1 'Early 
onset, atopic' 


Cluster 2 'Obese, 
late onset' 


Cluster 3 'normal lung 
function least severe 
asthma' 


Cluster 4 'late onset, 
eosinophilic' 


Cluster 5 'Airflow 
obstruction' 


1 


52.2 


15.2 


10.9 


13.0 


8.7 


2 


3.2 


71.0 


12.9 


3.2 


9.7 


3 


6.7 


40.0 


46.7 


6.7 


0.0 


4 


25.0 


15.0 


15.0 


25.0 


20.0 


5 


12.5 


18.8 


18.8 


0.0 


50.0 
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analysis in the Haldar el al study and we did not have data for the 
asthma control questionnaire in sufficient subjects to include in the 
analysis and this is likely to explain the small differences in cluster 
characteristics between these studies. Taken together these studies 
do suggest that cross-sectionaUy these cluster phenotypes are 
robust across western European and North American populations 
of refractory/ severe asthma, although importandy the proportions 
vary between centres which is likely a reflection of referral 
patterns, but possibly due to real geographical differences. 
Interestingly, recent data from a multi-centre Korean study 
identified four refractory asthma clusters [8] that were mostiy 
different to our study or the previously reported UK and USA 
studies suggesting possible influences of geography and ethnicity. 

The potential utility of the identification of refractory/severe 
asthma clusters is whether they represent distinct endotypes with 
different underlying aetiology and immunopathogenesis and 
whether they predict responses to therapy and natural history 
([1,3], and [4]). We did not investigate the potential immunolog- 
ical or pathophysiological mechanisms for the clusters and this 
warrants further investigation. The patients were managed 
clinically and follow-up data was available at a single time point 
with a median time of follow-up of 3 years. At follow-up the 
proportion of subjects treated with oral corticosteroids increased in 
all groups with significant increases for cluster 3 and 4. This was 
coupled with a concomitant increase in BMI, which was significant 
in all but cluster 2. Clusters 1, 4 and 5 had statistical improvement 
in FEVi, but this change was not closely associated with changes 
in therapy or peripheral blood eosinophil count. The disconnect 
between eosinophilic inflammation and lung function is well 
described [4] , but the inconsistency between improvement in lung 
function and increased therapy may reflect changes in adherence 
to therapy during the study, which was not formally assessed 
across centres, or perhaps in part due to regression towards the 
mean in the clusters with more impaired lung function. 
Exacerbation frequency decreased significantly in clusters 1, 2, 
and 4 and was associated with a significant fall in the peripheral 
blood eosinophil count in clusters 2 and 4 and a small non- 
significant decrease in cluster 1 . In contrast, in cluster 5 there was 
a fall in the eosinophil count with a small increase in exacerbation 
frequency, but this small increase in exacerbation frequency is 
perhaps more likely to reflect regression to the mean than a 
clinically important difference. An association between eosino- 
philic inflammation and exacerbations is consistent with previous 
observations that eosinophilic inflammation is a good predictor of 
response to corticosteroid therapy [29] and interleukin-5 mono- 
clonal antibody therapy [30,31], and that the reduction in 
exacerbation frequency is also related to the reduction in 
peripheral blood eosinophilia. 

Cluster membership stability was assessed at follow-up using the 
classifier. This was 52% for the whole group with stability being 
best in cluster 2 (71%) and worst in cluster 4 (25%). The largest 
changes were that a substantial proportion of the patients in 
clusters 1, 4, and 5 moved into clusters 2 and 3 at foUow up 
reflecting improvements in lung function, peripheral blood 
eosinophilia, and exacerbation frequency with increasing obesity. 
The instability of the cluster membership is perhaps unsurprising 
in a disease characterised by variability over time and in a real 
world study in which interventions are instigated as per clinical 
guidelines rather than in a controlled study. This is not 
problematic when the instability of the phenotype is used to 
direct therapy at the time of an assessment for example with the 
use of a sputum eosinophilia to titrate corticosteroid therapy [32]. 
However, if emerging phenotype-specific therapies [33] require 
knowledge of phenotype stability within a patient this will require 
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an understanding of the dynamics of a patient's phenotype using 
repeated assessments over time and possibly in response to 
exacerbation events and therapeutic interventions. 

One major criticism of this study is that although several 
domains of the disease were assessed the asthma control and 
quality of life measures were not recorded systematically in all 
centres and therefore were insufficient for inclusion in this analysis. 
Changes in asthma control have a substantial impact upon 
morbidity and chnical decisions and their absence here does limit 
interpretation of why therapies were changed in some of the 
clusters. There is increasing recognition that poor adherence is 
common in severe refractory asthma, although its identification 
and management is complex [34] . The disconnect between some 
of the clinical outcomes and change in therapy would have been 
better informed in light of adherence rates in the clusters. 
Importantly, cluster membership is defined by what is measurable 
at the time of the study. In order to unlock other possible sub- 
groups or verify cluster immunopathogenesis data at other scales 
of the disease for example from tissue samples, sputum, protein or 
transcriptomic signatures from the host and analysis of environ- 
mental exposure such as the microbiome may shed more light 
upon the mechanisms driving the clusters. Although this report is 
amongst the largest groups of severe asthmatics studied our 
findings still require further validation in larger populations across 
different geographical locations with different health care systems. 
In particular, we found differences across centres in a single 
country and the reasons for this variability needs to be 
investigated. Additionally, the follow-up was limited to a single 
assessment and further work more carefully examining the 
dynamics of the severe refractory asthma phenotypes are required. 
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